
gh-109329: Support for basic pystats for Tier 2 #109913

Merged
brandtbucher merged 17 commits into python:main from
mdboom:tier2-stats
Oct 4, 2023


mdboom commented Sep 26, 2023

This implements a basic set of the Tier 2 pystats mentioned in #109329.

Implemented:

  • Total micro-ops executed
  • Total number of traces started
  • Total number of traces created
  • Optimization attempts
  • Per-uop execution counts, like we have for Tier 1 instructions
  • A histogram of uops executed per trace
  • Trace too long
  • Unsupported opcode (maybe even with counters for each offender)
  • Inner loop found
  • Too many frame pushes (currently capped at a depth of 5)
  • Too many frame pops (if we return from the original frame)
  • Recursive function call

It also fixes a "bug" where specialization "hits" in the running Tier 2 interpreter would count against the Tier 1 stats.

Not implemented (since it will require reworking the DEOPT_IF calls):

  • Exit reason counts: polymorphism vs. branch misprediction

Example output (for the nbody benchmark):

Optimization (Tier 2) stats

statistics about the Tier 2 optimizer

Overall stats

| | Count | Ratio |
|---|---:|---:|
| Optimization attempts | 5 | |
| Traces created | 5 | 100.0% |
| Traces executed | 1,600,070 | |
| Uops executed | 65,202,621 | 40 |
| Trace stack overflow | 0 | |
| Trace stack underflow | 0 | |
| Trace too long | 5 | |
| Inner loop found | 0 | |
| Recursive call | 0 | |
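The two non-empty entries in the Ratio column can be reproduced from the counts alone. The sketch below is only a reading of the table, not the summary script's code: it assumes "Traces created" is reported relative to "Optimization attempts", and that the bare `40` next to "Uops executed" is uops per trace execution with the fraction dropped.

```python
# Counts copied from the "Overall stats" table.
optimization_attempts = 5
traces_created = 5
traces_executed = 1_600_070
uops_executed = 65_202_621

# Assumption: the ratio is traces created per optimization attempt.
created_ratio = traces_created / optimization_attempts

# Assumption: the ratio is uops per trace execution, truncated to an integer.
uops_per_trace = uops_executed // traces_executed

print(f"Traces created ratio: {created_ratio:.1%}")
print(f"Uops per executed trace: {uops_per_trace}")
```

Under these assumptions the results match the table (100.0% and 40). The unrounded quotient is roughly 40.75.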

Trace length histogram

| Range | Count | Ratio |
|---|---:|---:|
| <= 4 | 0 | 0.0% |
| <= 8 | 0 | 0.0% |
| <= 16 | 0 | 0.0% |
| <= 32 | 0 | 0.0% |
| <= 64 | 5 | 100.0% |

Optimized trace length histogram

| Range | Count | Ratio |
|---|---:|---:|
| <= 4 | 0 | 0.0% |
| <= 8 | 0 | 0.0% |
| <= 16 | 0 | 0.0% |
| <= 32 | 0 | 0.0% |
| <= 64 | 5 | 100.0% |

Trace run length histogram

| Range | Count | Ratio |
|---|---:|---:|
| <= 4 | 0 | 0.0% |
| <= 8 | 0 | 0.0% |
| <= 16 | 0 | 0.0% |
| <= 32 | 0 | 0.0% |
| <= 64 | 1,600,070 | 100.0% |
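The three histograms above use power-of-two upper bounds (4, 8, 16, 32, 64). A minimal Python sketch of that kind of bucketing follows. It is not the C code in this PR: `bucket_upper_bound` and `histogram_rows` are hypothetical helpers, and the sample trace lengths are made up for illustration.

```python
from collections import Counter


def bucket_upper_bound(value: int, smallest: int = 4) -> int:
    """Return the smallest power-of-two bound >= value, starting at `smallest`."""
    bound = smallest
    while value > bound:
        bound *= 2
    return bound


def histogram_rows(values, bounds=(4, 8, 16, 32, 64)):
    """Return (bound, count, ratio) rows, one per bucket, including empty buckets."""
    counts = Counter(bucket_upper_bound(v) for v in values)
    total = len(values)
    return [(b, counts[b], counts[b] / total if total else 0.0) for b in bounds]


# Hypothetical trace lengths, chosen so that all five fall in the "<= 64" bucket.
sample_lengths = [40, 41, 39, 45, 50]
for bound, count, ratio in histogram_rows(sample_lengths):
    print(f"<= {bound} {count} {ratio:.1%}")
```

Values above the largest listed bound would need another bucket. The sketch ignores that case because every value in the output above falls at or below 64.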

Uop stats

| Uop | Count | Self | Cumulative |
|---|---:|---:|---:|
| STORE_FAST | 14,900,598 | 22.9% | 22.9% |
| _SET_IP | 14,100,583 | 21.6% | 44.5% |
| LOAD_FAST | 10,100,513 | 15.5% | 60.0% |
| _GUARD_BOTH_FLOAT | 3,700,274 | 5.7% | 65.6% |
| _BINARY_OP_SUBTRACT_FLOAT | 2,900,138 | 4.4% | 70.1% |
| UNPACK_SEQUENCE_TUPLE | 2,400,092 | 3.7% | 73.8% |
| UNPACK_SEQUENCE_LIST | 2,400,092 | 3.7% | 77.5% |
| _POP_JUMP_IF_TRUE | 1,700,049 | 2.6% | 80.1% |
| _EXIT_TRACE | 1,600,070 | 2.5% | 82.5% |
| _ITER_CHECK_LIST | 1,600,065 | 2.5% | 85.0% |
| _IS_ITER_EXHAUSTED_LIST | 1,600,065 | 2.5% | 87.4% |
| COPY | 1,599,948 | 2.5% | 89.9% |
| _ITER_NEXT_LIST | 1,400,053 | 2.1% | 92.0% |
| UNPACK_SEQUENCE_TWO_TUPLE | 1,000,039 | 1.5% | 93.6% |
| LOAD_CONST | 800,001 | 1.2% | 94.8% |
| SWAP | 799,974 | 1.2% | 96.0% |
| BINARY_SUBSCR_LIST_INT | 799,974 | 1.2% | 97.2% |
| _BINARY_OP_MULTIPLY_FLOAT | 400,095 | 0.6% | 97.9% |
| _BINARY_OP_ADD_FLOAT | 400,041 | 0.6% | 98.5% |
| STORE_SUBSCR_LIST_INT | 399,987 | 0.6% | 99.1% |
| POP_TOP | 200,017 | 0.3% | 99.4% |
| _ITER_CHECK_RANGE | 99,984 | 0.2% | 99.5% |
| _IS_ITER_EXHAUSTED_RANGE | 99,984 | 0.2% | 99.7% |
| _ITER_NEXT_RANGE | 99,979 | 0.2% | 99.8% |
| GET_ITER | 99,979 | 0.2% | 100.0% |
| BINARY_OP | 27 | 0.0% | 100.0% |
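The Self and Cumulative columns can be derived from the per-uop counts. The sketch below shows one way to do it, assuming the denominator is the overall "Uops executed" total (65,202,621). `self_and_cumulative` is a hypothetical helper, not a function from the summary script, and only the top three rows are used as input.

```python
from itertools import accumulate


def self_and_cumulative(counts: dict[str, int], total: int):
    """Return (name, count, self_ratio, cumulative_ratio) rows, largest count first."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    running = accumulate(count for _, count in ordered)
    return [
        (name, count, count / total, cumulative / total)
        for (name, count), cumulative in zip(ordered, running)
    ]


# Top three rows of the uop table; the total is taken from "Uops executed".
top_counts = {
    "LOAD_FAST": 10_100_513,
    "STORE_FAST": 14_900_598,
    "_SET_IP": 14_100_583,
}
for name, count, self_ratio, cum_ratio in self_and_cumulative(top_counts, 65_202_621):
    print(f"{name} {count:,} {self_ratio:.1%} {cum_ratio:.1%}")
```

With that denominator, the three rows come out in the same order as the table and round to the same percentages.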

Unsupported opcodes

| Opcode | Count |
|---|---:|
